Disambiguation and Unknown Term Translation in Cross Language Information Retrieval
Abstract
In this paper we present a report on our participation in the CLEF 2007 Chinese-English ad hoc bilingual track. We discuss a disambiguation strategy which employs a modified co-occurrence model to determine the most appropriate translation for a given query. This strategy is used alongside a pattern-based translation extraction method which addresses the ‘unknown term’ translation problem. Experimental results demonstrate that a combination of these two techniques substantially improves retrieval effectiveness when compared to various baseline systems that employ basic co-occurrence measures with no provision for out-of-vocabulary terms.
Related resources
Using Structured Queries for Disambiguation in Cross-Language Information Retrieval
Bilingual transfer dictionaries are an important resource for query translation in cross-language text retrieval. However, term translation is not an isomorphic process, so dictionary-based systems must address the problem of ambiguity in language translation. In this paper, we claim that boolean conjunction (the AND operator) provides simple and automatic disambiguation in the target languag...
Ambiguity and Unknown Term Translation in CLIR
In this paper we present a report on our participation in the CLEF Chinese-English ad hoc bilingual track, and we discuss a disambiguation strategy which employs a modified co-occurrence model to determine the most appropriate translation for a given query. This strategy is used alongside a pattern-based translation extraction method which addresses the ‘unknown term’ translation problem. Exper...
Phrase Identification in Cross-Language Information Retrieval
Term-sense ambiguity and the difficulty in translating phrases are the main sources of problems in dictionary-based cross-language information retrieval (CLIR) approaches. We propose a term similarity-based translation-phrase identification technique to enhance the retrieval effectiveness of a dictionary-based query translation method. The technique identifies noun-phrases in the target language b...
Cross-Language Retrieval with Wikipedia
We demonstrate a twofold use of Wikipedia for cross-lingual information retrieval. As our main contribution, we exploit Wikipedia hyperlinkage for query term disambiguation. We also use bilingual Wikipedia articles for dictionary extension. Our method is based on translation disambiguation; we combine the Wikipedia based technique with a method based on bigram statistics of pairs formed by tran...
Getting Information from Documents You Cannot Read: An Interactive Cross-Language Text Retrieval and Summarization System
In this paper we discuss research designed to investigate the ability of users to find information in texts written in languages unknown to them. One study shows how document thumbnail visualizations can be used effectively to choose potentially relevant documents. Another study shows how a user of a cross-language text retrieval system who has no foreign language knowledge can nevertheless c...
Publication date: 2007